Add automatic column synchronization for Excel-to-table syncs#210
Add automatic column synchronization for Excel-to-table syncs#210nabilbotpress wants to merge 12 commits intomasterfrom
Conversation
|
|
||
| actions: { | ||
| syncExcelFile: async ({ ctx, input, logger }) => { | ||
| syncExcelFile: async ({ ctx, input, logger, client }) => { |
There was a problem hiding this comment.
issue: Could you separate the action implementation into a separate file. Refer to other integrations
| const realClient = getRealClient(client) | ||
|
|
||
| const { sharepointFileUrl, sheetName, processAllSheets, sheetTableMapping } = input as any | ||
| const { sharepointFileUrl, sheetTableMapping, processAllSheets } = input |
There was a problem hiding this comment.
question: Could you clarify what the purpose of processAllSheets is?
There was a problem hiding this comment.
issue: Please refrain from using nested try/catch blocks
There was a problem hiding this comment.
suggestion: You can use this function to meaningfully extract error message:
import { isAxiosError } from 'axios'
import { ZodError, ZodIssue } from '@botpress/sdk'
const formatZodErrors = (issues: ZodIssue[]) =>
'Validation Error: ' +
issues
.map((issue) => {
const path = issue.path?.length ? `${issue.path.join('.')}: ` : ''
return path ? `${path}${issue.message}` : issue.message
})
.join('\n')
export const getErrorMessage = (err: unknown): string => {
if (isAxiosError(err)) {
// server dependent error
const status = err.response?.status
const data = err.response?.data
// always present
const message = err.message
if (typeof data === 'string' && data.trim()) {
return status ? `${data} (Status: ${status})` : data
}
return status ? `${message} (Status: ${status})` : message
}
if (err instanceof ZodError) {
return formatZodErrors(err.errors)
}
if (err instanceof Error) {
return err.message
}
if (typeof err === 'string') {
return err
}
if (err && typeof err === 'object' && 'message' in err) {
const message = (err as { message: unknown }).message
if (typeof message === 'string') {
return message
}
}
return 'An unexpected error occurred'
}
| continue | ||
| } | ||
|
|
||
| // Check for duplicate headers |
There was a problem hiding this comment.
issue: This code is hard to follow. Could you refactor it?
|
|
||
| if (rowsData.length > 0 && tableId) { | ||
| logger.forBot().info(`Clearing all existing rows from table "${tableNameForSheet}" (ID: ${tableId}).`) | ||
| try { |
There was a problem hiding this comment.
suggestion: Please try to use one try/catch block
| logger | ||
| .forBot() | ||
| .warn( | ||
| `Could not fetch table schema for "${tableNameForSheet}": ${error.message}. Will use string types for all columns.` |
There was a problem hiding this comment.
question: Will silent fail prevent user from knowing if tehre is an issue?
|
|
||
| // Log schema changes (only adding columns, not removing) | ||
| if (columnsToAdd.length > 0 || columnsToRemove.length > 0) { | ||
| logger |
There was a problem hiding this comment.
question: Should we log or throw errors in these checks?
| }) | ||
|
|
||
| // Preserve columns that are no longer in Excel (to maintain KB links) | ||
| columnsToRemove.forEach((col) => { |
There was a problem hiding this comment.
issue: This code is very hard to follow. Could you refactor it to make more readable?
- Fix schema not being fetched for newly created tables, causing numeric columns to be converted to strings - Add duplicate column header detection with clear error messages - Add processAllSheets input parameter to control error handling behavior - Update examples to use valid table names ending with 'Table' - Remove unused variables and type casting
- Separate action implementation into dedicated actions/ directory - Add getErrorMessage utility for consistent error handling across all catch blocks - Remove nested try/catch blocks by extracting focused helper functions - Refactor schema update logic into updateTableSchema() function - Refactor row insertion logic into insertTableRows() function - Extract duplicate header validation into validateUniqueHeaders() function - Remove processAllSheets parameter - always continue on errors and return failedSheets[] in output - Make schema fetch operations fail fast instead of using fallback schemas - Fix misleading logs about preserving removed columns (they are actually deleted) - Use `unknown` type for error handling instead of `any`
b561b0e to
aee0435
Compare
|
Suggestion: I would ask @SimonNRisk about opting in to using the global pnpm workspace |
ptrckbp
left a comment
There was a problem hiding this comment.
syncExcelFile needs to be simplified. There's too much going on in it. It's hard to follow or make sure all is well.
| export default new IntegrationDefinition({ | ||
| name: 'plus/sharepoint-excel', | ||
| version: '2.2.1', | ||
| version: '2.3.1', |
There was a problem hiding this comment.
question: would 2.3.0 work?
There was a problem hiding this comment.
yup, changed this to 2.3.0
| privateKey: z.string().min(1).describe('PEM‑formatted certificate private key'), | ||
| primaryDomain: z.string().min(1).describe('SharePoint primary domain (e.g. contoso)'), | ||
| siteName: z.string().min(1).describe('SharePoint site name'), | ||
| personalAccessToken: z.string().min(1).describe('Botpress Personal Access Token (PAT) for Tables API access'), |
There was a problem hiding this comment.
praise: removing unneeded permissions is great!
| .string() | ||
| .describe( | ||
| 'Map sheets to tables. Format: \'Sheet1:table1,Sheet2:table2\' or JSON: \'{"Sheet1":"table1","Sheet2":"table2"}\'' | ||
| 'Map sheets to tables. Format: \'Sheet1:productsTable,Sheet2:ordersTable\' or JSON: \'{"Sheet1":"productsTable","Sheet2":"ordersTable"}\'. Table names must end with \'Table\'.' |
There was a problem hiding this comment.
question: why was this change made? Is this integration specific to products and orders?
There was a problem hiding this comment.
We needed the table at the end of the table name, but I changed this to be more generic.
| error: z.string(), | ||
| }) | ||
| ) | ||
| .optional() |
There was a problem hiding this comment.
question: why remove the optional from processedSheets but not failedSheets? Both could be empty arrays and processed that way (or null / undefined)
| continue | ||
| } | ||
|
|
||
| const jsonData = xlsx.utils.sheet_to_json(worksheet, { header: 1 }) |
There was a problem hiding this comment.
todo: rename this to jsonArray or something indicating it's a list. this is not intuitive enough.
| return rowsToInsert.length | ||
| } | ||
|
|
||
| export const syncExcelFile: bp.IntegrationProps['actions']['syncExcelFile'] = async ({ |
There was a problem hiding this comment.
todo: split this function into parts that are easier to read. This is doing too much. It's getting, then looping over data and processing, then batch-sending it to botpress. Clearing tables sometimes.
There was a problem hiding this comment.
Agreed, simplified this to be easier to read.
- Split syncExcelFile into focused helper functions (parseSheetTableMapping, ensureTableExists, processSheet) - Standardize documentation examples to use generic table names - Make output schema consistent by always returning both processedSheets and failedSheets arrays
cfc4987 to
8fa2736
Compare
| import { getErrorMessage } from '../utils' | ||
|
|
||
| type Logger = Parameters<bp.IntegrationProps['actions']['syncExcelFile']>[0]['logger'] | ||
| type RealClient = any |
There was a problem hiding this comment.
issue: Here and below, please refrain from using any unless absolutely necessary
No description provided.